53 research outputs found

    BIOptimus: Pre-training an Optimal Biomedical Language Model with Curriculum Learning for Named Entity Recognition

    Using language models (LMs) pre-trained in a self-supervised setting on large corpora and then fine-tuning them for a downstream task has helped to deal with the problem of limited labeled data for supervised learning tasks such as Named Entity Recognition (NER). Recent research in biomedical language processing has offered a number of biomedical LMs pre-trained using different methods and techniques that advance results on many BioNLP tasks, including NER. However, there is still a lack of a comprehensive comparison of which pre-training approaches work best in the biomedical domain. This paper investigates different pre-training methods, such as pre-training the biomedical LM from scratch and pre-training it in a continued fashion. We compare existing methods with our proposed pre-training method of initializing weights for new tokens by distilling existing weights from the BERT model inside the context where the tokens were found. The method helps to speed up the pre-training stage and improve performance on NER. In addition, we compare how masking rate, corruption strategy, and masking strategies impact the performance of the biomedical LM. Finally, using the insights from our experiments, we introduce a new biomedical LM (BIOptimus), which is pre-trained using Curriculum Learning (CL) and a contextualized weight distillation method. Our model sets a new state of the art on several biomedical NER tasks. We release our code and all pre-trained model

    Cell wall functional activity and metal accumulation of halophytic plant species Plantago maritima and Triglochin maritima on the White Sea littoral zone (NW Russia)

    The presented study supplements the knowledge on ion-exchange capacity, swelling capacity (elasticity) of the plant cell wall, and the accumulation of heavy metals in the halophytic species Plantago maritima and Triglochin maritima in the tidal zone of the White Sea western coast. The littoral soils of the coastal territories are sandy or rocky-sandy, medium and slightly saline, with poor content of organic substances, Mn, Zn, Ni, and Pb. The studied soils are considered uncontaminated by heavy metals because they contain background amounts of Fe and Cu. Sea water is significantly polluted by Fe (3.8 MPC) and Ni (55 MPC), has poor content of Zn and Cu, and has background levels of Pb and Mn. The coastal dominant plant species P. maritima and T. maritima were characterized by intensive metal accumulation, which was reflected in the coefficient of biological absorption (CBA) of each metal by the whole plant. For P. maritima the following metal accumulation series was obtained: Cu (3.29) > Zn (2.81) > Ni (1.57) > Pb (1.30) > Mn (1.21) > Fe (0.97), and for T. maritima: Ni (3.80) > Fe (2.08) > Cu (1.91) > Zn (1.84) > Pb (1.51) > Mn (1.31). Roots accumulated 50–70% of the total Ni, Cu, Zn, Pb and Mn content in the plant, while leaves and stems contained 30–50%. Fe was allocated mainly in the roots (80%). The ion-exchange capacity of the plant cell wall for P. maritima and T. maritima, respectively, was established as follows: 3570–3700 and 2710–3070 μmol g⁻¹ dry cell weight per leaf; 2310–2350 and 1160–1250 μmol g⁻¹ dry cell weight per root

    Microcystis aeruginosa and M. wesenbergii Were the Primary Planktonic Microcystin Producers in Several Bulgarian Waterbodies (August 2019)

    The rising interest in harmful cyanoprokaryote blooms promotes an increase in phycological and ecological research on potentially toxic species and their hazardous substances. The present study aimed to identify the main microcystin (MC) producers and their contribution to the phytoplankton of shallow waterbodies in Bulgaria, applying different methods. The sampling was performed in August 2019 in nine lakes and reservoirs, two of which (reservoirs Kriva Reka and Izvornik 2) were studied for the first time. The high contribution of cyanoprokaryotes to the total species composition and phytoplankton abundance was demonstrated by light microscopic (LM) observations and HPLC analysis of marker pigments. The LM identification of potential MC-producers was supported by PCR amplification of mcyE and mcyB genes. The MC amounts, detected by HPLC-DAD, varied by site, ranging from undetectable concentrations to 0.46 µg L⁻¹, with only one recorded variant, namely MC-LR. It was found only in the reservoirs Mandra and Durankulak, while toxigenic MC-strains were obtained by PCR from five more waterbodies. Both LM and PCR demonstrated that the MC-producers were Microcystis aeruginosa and M. wesenbergii, despite their occurrence in low amounts (<0.5–5% of the total biomass) when filamentous cyanoprokaryotes dominated.

    Simultaneous down-regulation of tumor suppressor genes RBSP3/CTDSPL, NPRL2/G21 and RASSF1A in primary non-small cell lung cancer

    Background: The short arm of human chromosome 3 is involved in the development of many cancers including lung cancer. Three bona fide lung cancer tumor suppressor genes, namely RBSP3 (AP20 region), NPRL2 and RASSF1A (LUCA region), were identified in the 3p21.3 region. We have shown previously that homozygous deletions in AP20 and LUCA sub-regions often occurred in the same tumor (P < 10⁻⁶). Methods: We estimated the quantity of RBSP3, NPRL2, RASSF1A, GAPDH, RPN1 mRNA and RBSP3 DNA copy number in 59 primary non-small cell lung cancers, including 41 squamous cell and 18 adenocarcinomas, by real-time reverse transcription-polymerase chain reaction based on TaqMan technology and relative quantification. Results: We evaluated the relationship between mRNA level and clinicopathologic characteristics in non-small cell lung cancer. A significant expression decrease (≥2) was found for all three genes early in tumor development: in 85% of cases for RBSP3; 73% for NPRL2 and 67% for RASSF1A (P < 0.001), more strongly pronounced in squamous cell than in adenocarcinomas. Strong suppression of both NPRL2 and RBSP3 was seen in 100% of cases already at Stage I of squamous cell carcinomas. Deregulation of RASSF1A correlated with tumor progression of squamous cell (P = 0.196) and adenocarcinomas (P < 0.05). Most likely, genetic and epigenetic mechanisms might be responsible for transcriptional inactivation of RBSP3 in non-small cell lung cancers, as promoter methylation of RBSP3 according to NotI microarrays data was detected in 80% of squamous cell and in 38% of adenocarcinomas. With NotI microarrays we tested how often LUCA (NPRL2, RASSF1A) and AP20 (RBSP3) regions were deleted or methylated in the same tumor sample and found that this occurred in 39% of all studied samples (P < 0.05). Conclusion: Our data support the hypothesis that these TSG are involved in tumorigenesis of NSCLC. Both genetic and epigenetic mechanisms contribute to down-regulation of these three genes representing two tumor suppressor clusters in 3p21.3. Most importantly, expression of RBSP3, NPRL2 and RASSF1A was simultaneously decreased in the same sample of primary NSCLC: in 39% of cases all these three genes showed reduced expression (P < 0.05).

    Sister chromatid exchanges and micronuclei in peripheral lymphocytes of shoe factory workers exposed to solvents.

    We examined sister chromatid exchanges (SCEs) and micronuclei (MN; cytokinesis-block method) in cultured peripheral lymphocytes from 52 female workers of two shoe factories and from 36 unexposed age- and sex-matched referents. The factory workers showed an elevated level of urinary hippuric acid, a biomarker of toluene exposure, and workplace air contained high concentrations of various organic solvents such as toluene, gasoline, acetone, and (in one of the plants only) ethylacetate and methylenediphenyl diisocyanate. The shoe factory workers showed a statistically significantly higher frequency of micronucleated binucleate lymphocytes in comparison with the referents. This finding agreed with three preliminary MN determinations (each comprising 27-32 shoe workers and 16-20 controls) performed in one of the plants 2-5 years earlier. The shoe factory workers also had a lower average level of blood hemoglobin than the referents. In contrast, no difference was found between the groups in SCE analysis. Smokers showed significantly higher mean frequencies of SCEs per cell and of high frequency cells (HFC) than nonsmokers. Aging was associated with increased MN rates and reduced cell proliferation. Polymorphism of the glutathione S-transferase M1 gene (GSTM1) did not affect the individual level of SCEs, but in smoking shoe workers an effect of the occupational exposure on the frequency of micronucleated cells could be seen only in GSTM1 null subjects. The low prevalence of the glutathione S-transferase T1 (GSTT1) null genotype precluded the evaluation of the influence of GSTT1 polymorphism. Our results show that the shoe factory workers have experienced genotoxic exposure, which is manifest as an increase in the frequency of MN, but not of SCEs, in peripheral lymphocytes. The exposures responsible for the MN induction could not be identified with certainty, but exposure to benzene in gasoline and methylenediphenyl diisocyanate may explain some of the findings

    Metabolism within the tumor microenvironment and its implication on cancer progression: an ongoing therapeutic target

    Since reprogramming energy metabolism is considered a new hallmark of cancer, tumor metabolism is again in the spotlight of cancer research. Many studies have been carried out and many possible therapies have been developed in recent years. However, tumor cells are not alone. A series of extracellular components and stromal cells, such as endothelial cells, cancer-associated fibroblasts, tumor-associated macrophages and tumor-infiltrating T cells, surround tumor cells in the so-called tumor microenvironment. Metabolic features of these cells are being studied in depth in order to find relationships between metabolism within the tumor microenvironment and tumor progression. Moreover, it cannot be forgotten that tumor growth is able to modulate host metabolism and homeostasis, so the tumor microenvironment is not the whole story. Importantly, the metabolic switch in cancer is just a consequence of the flexibility and adaptability of metabolism and should not be surprising. Treatments of cancer patients with combined therapies that pair anti-tumor agents with agents targeting stromal cell metabolism, anti-angiogenic drugs and/or immunotherapy are being developed as promising therapeutics. Mª Carmen Ocaña is recipient of a predoctoral FPU grant from the Spanish Ministry of Education, Culture and Sport. Supported by grants BIO2014-56092-R (MINECO and FEDER), P12-CTS-1507 (Andalusian Government and FEDER) and funds from group BIO-267 (Andalusian Government). The "CIBER de Enfermedades Raras" is an initiative from the ISCIII (Spain). The funders had no role in the study design, data collection and analysis, decision to publish or preparation of the manuscript

    High Mutability of the Tumor Suppressor Genes RASSF1 and RBSP3 (CTDSPL) in Cancer

    BACKGROUND: Many different genetic alterations are observed in cancer cells. Individual cancer genes display point mutations such as base changes, insertions and deletions that initiate and promote cancer growth and spread. Somatic hypermutation is a powerful mechanism for generation of different mutations. It was shown previously that somatic hypermutability of proto-oncogenes can induce development of lymphomas. METHODOLOGY/PRINCIPAL FINDINGS: We found an exceptionally high incidence of single-base mutations in the tumor suppressor genes RASSF1 and RBSP3 (CTDSPL), both located in 3p21.3 regions, LUCA and AP20 respectively. These regions contain clusters of tumor suppressor genes involved in multiple cancer types such as lung, kidney, breast, cervical, head and neck, nasopharyngeal, prostate and other carcinomas. Altogether in 144 sequenced RASSF1A clones (exons 1-2), 129 mutations were detected (mutation frequency, MF = 0.23 per 100 bp) and in 98 clones of exons 3-5 we found 146 mutations (MF = 0.29). In 85 sequenced RBSP3 clones, 89 mutations were found (MF = 0.10). The mutations were not cytidine-specific, as would be expected from alterations generated by AID/APOBEC family enzymes, and appeared de novo during cell proliferation. They diminished the ability of corresponding transgenes to suppress cell and tumor growth, implying a loss of function. These high levels of somatic mutations were found both in cancer biopsies and cancer cell lines. CONCLUSIONS/SIGNIFICANCE: This is the first report of high frequencies of somatic mutations in RASSF1 and RBSP3 in different cancers, suggesting it may underlie the mutator phenotype of cancer. Somatic hypermutations in tumor suppressor genes involved in major human malignancies offer a novel insight into cancer development, progression and spread